{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Julia for Data Science - Plotting\n",
    "\n",
    "### Data visualization: generating nice looking plots in Julia is straight forward\n",
    "In what's next, we will see some of the tools that Julia plotting provides to produce high quality figures for your data. In particular we'll look at\n",
    "\n",
    "1. Plotting mathematical functions\n",
    "1. Visualizing statistics\n",
    "1. Subplotting\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 1: plot math functions (specifically latex equations) in our plots"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "using LaTeXStrings\n",
    "using Plots\n",
    "pyplot(leg=false)\n",
    "x = 1:0.2:4"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create three functions and plot them all!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "y1 = sqrt.(x)\n",
    "y2 = log.(x)\n",
    "y3 = x.^2\n",
    "\n",
    "f1 = plot(x,y1)\n",
    "plot!(f1,x,y2) # \"plot!\" means \"plot on the same canvas we just plot on\"\n",
    "plot!(f1,x,y3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can annotate each of these plots! using either text, or latex strings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "annotate!(f1,[(x[6],y1[6],text(L\"\\sqrt{x}\",16,:center)),\n",
    "          (x[11],y2[11],text(L\"log(x)\",:right,16)),\n",
    "          (x[6],y3[6],text(L\"x^2\",16))])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 2: Visualizing statistics\n",
    "\n",
    "**2D histograms** are really easy!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "n = 1000\n",
    "set1 = randn(n)\n",
    "set2 = randn(n)\n",
    "histogram2d(set1,set2,nbins=20,colorbar=true)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Let's go back to our houses dataset and learn even more things about it!**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "using DataFrames\n",
    "houses = readtable(\"houses.csv\")\n",
    "filter_houses = houses[houses[!, :sq__ft].>0,:]\n",
    "x = filter_houses[!, :sq__ft]\n",
    "y = filter_houses[!, :price]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "gh = histogram2d(x,y,nbins=20,colorbar=true)\n",
    "xaxis!(gh,\"square feet\")\n",
    "yaxis!(gh,\"price\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Interesting! \n",
    "\n",
    "Most houses sold are in the range 1000-1500 and they cost approximately 150,000 dollars"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "*Let's see more stats plots.*\n",
    "\n",
    "We can convince ourselves that random distrubutions are indeed very similar.\n",
    "\n",
    "Let's do that through **box plots** and **violin plots**."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "using StatsPlots\n",
    "y = rand(10000,6) # generate 6 random samples of size 1000 each\n",
    "f2 = violin([\"Series 1\" \"Series 2\" \"Series 3\" \"Series 4\" \"Series 5\"],y,leg=false,color=:red)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "boxplot!([\"Series 1\" \"Series 2\" \"Series 3\" \"Series 4\" \"Series 5\"],y,leg=false,color=:green)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These plots look almost identical, so we do have the same distribution indeed.\n",
    "\n",
    "Let's study the price distributions in different cities in the houses dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "some_cities = [\"SACRAMENTO\",\"RANCHO CORDOVA\",\"RIO LINDA\",\"CITRUS HEIGHTS\",\"NORTH HIGHLANDS\",\"ANTELOPE\",\"ELK GROVE\",\"ELVERTA\" ] # try picking pther cities!\n",
    "\n",
    "fh = plot(xrotation=90)\n",
    "for ucity in some_cities\n",
    "    subs = filter_houses[filter_houses[!, :city].==ucity,:]\n",
    "    city_prices = subs[!, :price]\n",
    "    violin!(fh,[ucity],city_prices,leg=false)\n",
    "end\n",
    "display(fh)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Part 3: Subplots are very easy!\n",
    "\n",
    "To create a plot with subplots, all we need to do is throw the variables bound to individual plots inside another call to `plot`!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = -10:.1:10\n",
    "p1 = plot(x, x)\n",
    "p2 = plot(x, x.^2)\n",
    "p3 = plot(x, x.^3)\n",
    "p4 = plot(x, x.^4)\n",
    "plot(p1,p2,p3,p4,layout=(2,2),legend=false)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can create your own layout as follows."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mylayout = @layout([a{0.5h};[b{0.7w} c]])\n",
    "plot(fh,f2,gh,layout=mylayout,legend=false)\n",
    "\n",
    "# this layout:\n",
    "# 1 \n",
    "# 2 3"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Please let us know how we're doing!\n",
    "\n",
    "https://tinyurl.com/JuliaDataScience"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Julia 1.0.5",
   "language": "julia",
   "name": "julia-1.0"
  },
  "language_info": {
   "file_extension": ".jl",
   "mimetype": "application/julia",
   "name": "julia",
   "version": "1.0.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
